Using P2P Computing and Mobile Agents For Web Information Retrieval: A Framework Design

Author

  • Yihua Sheng
Abstract

The Internet holds a tremendous amount of information that could readily be gathered and mined, but a lack of useful tools for doing so leaves that information under-utilized. It would also be valuable for an individual to have a personal information gathering and filtering system that can be freely tuned to that individual's needs. Building and maintaining a reliable and efficient web information retrieval system (WIRS), however, is very costly: reliable and durable software and hardware, a large number of computers for web crawling, and a high-speed network connection are the minimum requirements. This paper proposes a lightweight information retrieval system based on mobile agents and peer-to-peer (P2P) computing technologies to drastically reduce that cost.

P2P computing is a new networking paradigm in which two or more clients work together as equals. Harnessing the unused processing power and storage of computers on the Internet could deliver supercomputing capabilities at a fraction of the current cost. The SETI@home project and Napster are two success stories.

A mobile agent is a persistent entity (i.e., it can outlive the application it originates from) that is typically limited in size and, most importantly, is able to migrate: it can suspend its execution, move to another location, and continue execution there. When a mobile agent migrates, it has to carry its data state (variables) as well as its execution state (active threads). An execution environment for mobile agents, called an agency, will be voluntarily installed on each participating user's computer in the P2P computing pool. The function of the agency is to accept, run, and move mobile agents.

Four kinds of mobile agents are present in this system: Web Crawling Agent (WCA), Information Filtering Agent (IFA), Manager Agent (MA), and Report Generation Agent (RCA). WCAs are responsible for retrieving information from different sources on the Internet.
The input to a WCA is a list of URLs to crawl; its output is the contents of the web pages crawled. IFAs are responsible for filtering the information received from WCAs. The precision of the information returned relies largely on the IFAs: an IFA decides whether the information from WCAs should be discarded or sent back as useful information. A single IFA can receive and filter information from multiple WCAs. MAs are responsible for managing the mobile agents (WCAs and IFAs) in the system, including themselves. Each MA maintains a reasonable number of other agents in the system. RCAs are responsible for generating final reports based on the returned web pages stored in the database. Report templates are created and stored in a Report Templates Database. Users can define their own templates and store them in the database. To some extent, templates guide the activities of web crawling and information filtering.

In addition to these mobile agents, a stationary console agent (CA) is developed and installed with the agency on users' computers. The coordination and collaboration among these agents are described below:

1. A user uses the CA to start an information retrieval job, such as traversing a set of websites and their related sites to extract information on a particular product.
2. After receiving a search job from a user, the CA analyzes the nature of the job and then splits it into smaller pieces. An MA is created for each piece of the job and dispatched to the P2P computing pool for execution.
3. The MAs then create a series of WCAs and IFAs and send them out to the computers in the same pool.
4. The WCAs crawl websites and send the web pages to their corresponding IFAs.

Peer-to-Peer Computing Services 180
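The four coordination steps listed in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the paper's implementation: every class, function, and parameter name here is hypothetical, the keyword match stands in for whatever relevance test an IFA would really apply, the job is split by simple URL chunking, and agent migration between agencies is not modeled at all. The `fetch` callable is injected so that the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InformationFilteringAgent:
    """IFA: decides whether a crawled page is kept or discarded."""
    keyword: str
    kept: Dict[str, str] = field(default_factory=dict)

    def receive(self, url: str, content: str) -> None:
        # Assumed relevance test: case-insensitive keyword match.
        if self.keyword.lower() in content.lower():
            self.kept[url] = content


@dataclass
class WebCrawlingAgent:
    """WCA: input is a list of URLs, output is the content of each page."""
    urls: List[str]
    fetch: Callable[[str], str]

    def crawl(self, ifa: InformationFilteringAgent) -> None:
        # Step 4: send every crawled page to the corresponding IFA.
        for url in self.urls:
            ifa.receive(url, self.fetch(url))


@dataclass
class ManagerAgent:
    """MA: creates the WCAs and the IFA for one piece of the job."""
    urls: List[str]
    keyword: str
    fetch: Callable[[str], str]
    urls_per_wca: int = 2  # assumed chunk size, not from the paper

    def run(self) -> Dict[str, str]:
        # Step 3: one IFA receives pages from several WCAs.
        ifa = InformationFilteringAgent(self.keyword)
        for i in range(0, len(self.urls), self.urls_per_wca):
            chunk = self.urls[i:i + self.urls_per_wca]
            WebCrawlingAgent(chunk, self.fetch).crawl(ifa)
        return ifa.kept


def console_agent(urls: List[str], keyword: str,
                  fetch: Callable[[str], str], pieces: int = 2) -> Dict[str, str]:
    """CA: steps 1 and 2, split the job and create one MA per piece."""
    size = max(1, -(-len(urls) // pieces))  # ceiling division
    results: Dict[str, str] = {}
    for i in range(0, len(urls), size):
        results.update(ManagerAgent(urls[i:i + size], keyword, fetch).run())
    return results


# Stand-in for the web: a dictionary from made-up URLs to page text.
FAKE_WEB = {
    "site-a/1": "Widget X review and price",
    "site-a/2": "Company history",
    "site-b/1": "Where to buy a widget",
    "site-b/2": "Contact us",
    "site-c/1": "WIDGET specifications",
}

if __name__ == "__main__":
    hits = console_agent(list(FAKE_WEB), "widget", FAKE_WEB.__getitem__)
    for url in sorted(hits):
        print(url)
```

In a real P2P deployment each agent would run inside an agency on a different peer and exchange pages over the network; here the agents are plain objects called in sequence, which keeps the division of responsibilities visible while leaving out distribution and migration.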


Similar resources

Behavioral Considerations in Developing Web Information Systems: User-centered Design Agenda

The current paper explores designing a web information retrieval system regarding the searching behavior of users in real and everyday life. Designing an information system that is closely linked to human behavior is equally important for providers and the end users.  From an Information Science point of view, four approaches in designing information retrieval systems were identified as system-...


Beyond Term Indexing: A P2P Framework for Web Information Retrieval

Web search over peer-to-peer (P2P) networks shows promise to become an alternative to the state-of-the-art search engines since P2P overlays offer means for decentralized search across widely-distributed document collections. However, the design of effective techniques for P2P indexing and retrieval raises a number of technical challenges due to potentially unscalable resource (e.g. bandwidth, ...


User Interface Design in Mobile Educational Applications

Introduction: User interfaces are a crucial factor in ensuring the success of mobile applications. Mobile Educational Applications not only provide flexibility in learning, but also allow learners to learn at any time and any place. The purpose of this article is to investigate the effective factors affecting the design of the user interface in mobile educational applications. Methods: Quantita...


A Distributed Software Environment for Collaborative Web Computing

This paper describes an extensible core software element of a distributed, peer-to-peer system, which provides several facilities in order to help the implementation of collaborative, Web-based, distributed information storing and retrieval applications based on a decentralized P2P model. Moreover, after an architectural introduction of the core distributed software module, the Core Node, this ...


Proactively Composing Web Services as Tasks by Semantic Web Agents

The chapter presents the framework for agent-enabled dynamic composition of Semantic Web Services. The approach and the framework have been developed in several research and development projects by ISRG and IOG. The core of the methodology is the new understanding of a Semantic Web Service as a capability of an intelligent software agent supplied with the proper ontological description. It is d...



Publication date: 2003